Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Provide support for batch scraping #2459

Merged
merged 136 commits into from
Oct 7, 2024

Conversation

hkfgo
Copy link
Collaborator

@hkfgo hkfgo commented Mar 23, 2024

Hey Tom, hope all is well!

I've attached this diagram to illustrate my proposed change.
Screenshot 2024-03-22 at 10 23 21 PM

How to Batch?

Scrape jobs are currently defined per <resource, metric>. I'm taking the simplest approach to just batch resources looking for the same metric.

Max batch size will be a configurable parameter, under runtime configuration.

Batch Job Execution Flow

Screenshot 2024-08-21 at 5 37 25 PM

Changes to Internal Metrics

Internal Promitor metrics will need to be modified under batch scraping mode.
promitor_scrape_success and promitor_scrape_error metrics should be appended with an additional tag batched, to indicate the mode of operation(0/1)
A histogram metric promitor_batch_size should be added to track actual batch sizes
promitor_ratelimit_arm_throttled/promitor_ratelimit_arm metrics will no longer be published, since batch scraping does not go through ARM.

Other considerations

Batch scraping will not be free anymore but comes with other benefits(like no longer being subject to ARM throttling). This should be documented for all users

Implements #2284

@hkfgo hkfgo requested a review from tomkerkhove as a code owner March 23, 2024 05:30
@trafico-bot trafico-bot bot added the 🔍 Ready for Review Pull Request is not reviewed yet label Mar 23, 2024
Copy link

Thank you for your contribution! 🙏 We will review it as soon as possible.

@tomkerkhove
Copy link
Owner

Thanks! Is it safe to say that the top of the diagram is current and bottom shows these changes?

@tomkerkhove tomkerkhove marked this pull request as draft April 4, 2024 10:50
@tomkerkhove tomkerkhove changed the title draft: enable batch scraping feat: Provide support for batch scraping Apr 4, 2024
Copy link
Owner

@tomkerkhove tomkerkhove left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good start, added a few remarks but no concerns!

@hkfgo
Copy link
Collaborator Author

hkfgo commented May 9, 2024

Hey I'm revisiting this PR now since SDK migration is almost across the finish line. Do you know if the batch scraping API is available in sovereign clouds and what URLs we should use? We are particularly interested in the US Gov cloud. I can't find any information online(Azure doesn't really mentioned the existence of sovereign clouds to begin with)

@tomkerkhove
Copy link
Owner

Do you know if the batch scraping API is available in sovereign clouds and what URLs we should use?

I do not know, sorry! What is the URL for public cloud?

@tomkerkhove
Copy link
Owner

@hkfgo
Copy link
Collaborator Author

hkfgo commented May 30, 2024

Awesome! How did you find this out?

@tomkerkhove
Copy link
Owner

I reached out to the team :D

@tomkerkhove
Copy link
Owner

@hkfgo Do you know if you'll have time to finalize this PR next week? I'd like to cut a release next week.

@hkfgo
Copy link
Collaborator Author

hkfgo commented Jun 1, 2024

@hkfgo Do you know if you'll have time to finalize this PR next week? I'd like to cut a release next week.

We should plan for batch scraping to be on the release after this coming one. There are still a decent amount of work left and I'm not able to work on this full time.

I do have time to get SDK migration done though. In that case we should have something major-ish to release :)

@tomkerkhove
Copy link
Owner

That's OK - Thank you!

@tomkerkhove
Copy link
Owner

@hkfgo Any thoughts on this PR? Are you able to continue working on it?

@hkfgo
Copy link
Collaborator Author

hkfgo commented Jul 15, 2024

@hkfgo Any thoughts on this PR? Are you able to continue working on it?

Hey I got caught up in some other work and took a few weeks PTO. I'm back working on it now :)

@hkfgo
Copy link
Collaborator Author

hkfgo commented Jul 15, 2024

Any thoughts on this btw: #2459 (comment)?

Copy link
Owner

@tomkerkhove tomkerkhove left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, two more comments

@tomkerkhove
Copy link
Owner

Great! We're ready to go, let me know when the CI is passing 👌

@hkfgo
Copy link
Collaborator Author

hkfgo commented Oct 2, 2024

Great! We're ready to go, let me know when the CI is passing 👌

I think this is okay to merge as is, since we are pretty sure now that the reason for failure is test flakiness. The only failing part of CI are the Scraper OTEL Collector tests on Linux(WIndows image built just fine). I don't mind retrying every now and then but might be better to fix the flakiness through a separate PR. Wdyt?

@hkfgo
Copy link
Collaborator Author

hkfgo commented Oct 2, 2024

Oh hey the CI's passed after a few retries

@hkfgo
Copy link
Collaborator Author

hkfgo commented Oct 7, 2024

Ugh sorry I just realized I hadn't resolved all comments. Should be good to merge now

@tomkerkhove
Copy link
Owner

Thanks a ton for all this hard work!

@tomkerkhove tomkerkhove merged commit 5fa547b into tomkerkhove:master Oct 7, 2024
26 of 27 checks passed
tomkerkhove added a commit that referenced this pull request Nov 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🔍 Ready for Review Pull Request is not reviewed yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants